A successive state and mixture splitting for optimizing the size of models in speech recognition
Authors
Abstract
This paper proposes a Successive State and Mixture Splitting (SSMS) algorithm for optimizing the size of the models used in speech recognition on small mobile devices. The algorithm is based on a Continuous Hidden Markov Model (CHMM) structure with a variable-parameter topology, which minimizes the number of model parameters and reduces recognition time. SSMS splits the Gaussian Output Probability Density Distribution (GOPDD) to build a variable-parameter, context-independent model. Unlike Successive State Splitting, which generates context-dependent models, SSMS constructs a context-independent model with a suitable number of states and mixtures for each recognition unit by automatically splitting the GOPDD in the time and mixture domains. Recognition results showed that SSMS could reduce the total number of Gaussians by up to 40.0% compared with fixed-parameter models at the same speech recognition performance.
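The abstract does not state the splitting rule or the criterion for choosing which distribution to split, so the following is only a sketch of what a split in each domain could look like, not the authors' procedure. It assumes diagonal-covariance Gaussians, a mixture-domain split that perturbs the mean by a fraction of the standard deviation and halves the weight, and a time-domain split that duplicates a state into two consecutive states. All names (`Gaussian`, `State`, `split_in_mixture_domain`, `split_in_time_domain`, `total_gaussians`) and the perturbation factor `eps` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Gaussian:
    weight: float        # mixture weight within its state
    mean: List[float]
    var: List[float]     # diagonal covariance (assumption)


@dataclass
class State:
    mixtures: List[Gaussian]


def split_gaussian(g: Gaussian, eps: float = 0.2) -> List[Gaussian]:
    """Split one Gaussian into two by shifting the mean by +/- eps * std.

    The perturbation rule and eps are assumptions for illustration only.
    """
    offset = [eps * math.sqrt(v) for v in g.var]
    lower = Gaussian(g.weight / 2, [m - o for m, o in zip(g.mean, offset)], list(g.var))
    upper = Gaussian(g.weight / 2, [m + o for m, o in zip(g.mean, offset)], list(g.var))
    return [lower, upper]


def split_in_mixture_domain(state: State, index: int, eps: float = 0.2) -> State:
    """Return a new state in which mixture `index` is replaced by two components."""
    parts = split_gaussian(state.mixtures[index], eps)
    return State(state.mixtures[:index] + parts + state.mixtures[index + 1:])


def split_in_time_domain(states: List[State], index: int) -> List[State]:
    """Return a new state sequence in which state `index` is followed by a copy of itself."""
    copy = State([Gaussian(g.weight, list(g.mean), list(g.var))
                  for g in states[index].mixtures])
    return states[:index + 1] + [copy] + states[index + 1:]


def total_gaussians(states: List[State]) -> int:
    """Total number of Gaussians in the model, the quantity the abstract reports on."""
    return sum(len(s.mixtures) for s in states)
```

In a real training loop each split would be followed by re-estimation of the model parameters and kept only if some quality criterion improved. Both steps are omitted here because the abstract does not describe them.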
Similar resources
Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model
Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....
Automatic generation of context-independent variable parameter models using successive state and mixture splitting
A Speech and Character Combined Recognition System (SCCRS) is developed for working on PDA (Personal Digital Assistants) or on mobile devices. In SCCRS, feature extraction for speech and for character is carried out separately, but recognition is performed in an engine. The recognition engine employs essentially CHMM (Continuous Hidden Markov Model) structure and this CHMM consists of variable ...
A Speech and Character Combined Recognition Engine for Mobile Devices
A Speech and Character Combined Recognition Engine (SCCRE) is developed for working on Personal Digital Assistants (PDA) or on mobile devices. In SCCRE, feature extraction from speech and character is carried out separately, but recognition is performed in an engine. The recognition engine employs essentially CHMM (Continuous Hidden Markov Model) structure and this CHMM consists of variable par...
Optimal tying of HMM mixture densities using decision trees
Decision trees have been used in speech recognition with large numbers of context-dependent HMM models, to provide models for contexts not seen in training. Trees are usually created by successive node splitting decisions, based on how well a single Gaussian or Poisson density fits the data associated with a node. We introduce a new node splitting criterion, derived from the maximum likelihood f...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
Publication date: 2006